Clustered order of selected sequences:

2. CDC42FRAG.PEP (1-107)
3. BH102FRAG.PEP (1-107)
4. SF2FRAG.PEP (1-107)
1. MALFRAG.PEP (1-107)
5. SYHFRAG.PEP (1-107)

SYNTHETIC DNA DERIVED RECOMBINANT HIV ANTIGENS



BACKGROUND OF THE INVENTION 



The present invention relates to recombinant HIV (Human Immunodeficiency Virus) antigens. Recombinant
antigens derived from the molecular cloning and expression in a heterologous expression system of
the synthetic DNA sequences of the various HIV antigens can be used as reagents for the detection of 
antibodies and antigen in body fluids from individuals exposed to various HIV isolates. 

The nucleotide sequence of the proviral genome has been determined for several HIV isolates,
including HIV-1 strains HTLV-III (Ratner et al., Nature (1985) 313:277); ARV-2 (Sanchez-Pescador et al.,
Science (1985) 227:484); LAV (Wain-Hobson et al., Cell (1985) 40:9); and CDC-451 (Desai et al., Proc.
Natl. Acad. Sci. USA (1986) 83:8380). The nucleotide sequence of the HIV-2 ROD isolate was reported by
Guyader et al. (Nature (1987) 326:662).

HIV antigens have been obtained from the virus grown in tissue culture, or from a molecularly cloned 
genomic fragment expressed in heterologous hosts such as Escherichia coli. The tissue culture derived 
virus involves the cumbersome and often difficult process of growing virus infected cells in stringent sterile
conditions. Further, the virus derived from tissue culture is infectious, and, therefore is hazardous to the 
health of individuals involved in propagation and purification. The expression of molecularly cloned HIV 
genomic fragments overcomes the biohazard problem. Generally, an HIV genomic fragment from a single 
HIV isolate with mammalian codons is expressed in a heterologous system, such as, bacteria or yeast, and 
is limited to the use of available restriction sites present in the viral genome for cloning and expression. 

It has been difficult to obtain expression in heterologous systems of some of the HIV proteins, such as
the HIV-1 envelope antigen gp41. Several researchers have tried deleting the hydrophobic regions of the
HIV-1 gp41 to increase expression levels. UK Patent Application GB 2188639 discloses an HTLV-III gag/env
gene protein wherein codons corresponding to the first hydrophobic region of the gp41 protein are
deleted from the env fragment of the DNA sequence. U.S. Patent No. 4,753,873 discloses a peptide fragment
that is encoded by a nucleotide sequence wherein the nucleotides coding for a first and second hydrophobic
region of HTLV-III gp41 are deleted.

Poor expression can be the result of many factors, including the specific nucleic acid sequence of the
gene to be expressed, the fact that the mammalian codons of the gene sequence may not be efficiently
transcribed and translated in a particular heterologous system, and the secondary structure of the
transcribed messenger RNA. The use of synthetic DNA fragments can increase expression in heterologous
systems.



SUMMARY OF THE INVENTION 



Recombinant antigens which are derived from the molecular cloning and expression of synthetic DNA 
sequences in heterologous hosts are provided. Synthetic DNA sequences coding for the recombinant 
antigens of the invention are further provided. The synthetic DNA sequences selected for expression of 
various HIV antigens are based on the amino acid sequence of either a single isolate or several isolates, 
optimized for expression in Escherichia coli by specific codon selection. The synthetic DNA sequence gives 
higher expression of the particular antigen encoded. These antigens can be substituted for viral antigens 
derived from tissue culture for use as diagnostic and therapeutic reagents. 

The present invention can be utilized to synthesize a full length HIV transmembrane envelope gene using
bacterial codons. Another aspect of the invention involves the linkage of sequences which are poorly
expressed as individual proteins to sequences which are expressed with high efficiency. The sequence of
the entire coding region of a gene of one virus can be combined with coding sequences of another gene
from a different virus to produce a fusion protein. The fusion proteins thus expressed have the unique
advantage of carrying antigenic epitopes of two viral antigens.

The present invention includes full length synthetic genes (FSG) for HIV-1 and HIV-2 transmembrane 
glycoprotein (TMP). 



DESCRIPTION OF THE DRAWINGS 
Fig. 1 illustrates the alignment of the TMP fragment encoding amino acid residue nos. 552-668 of
HIV-1 with the sequences of the four different isolates used to derive the amino acid sequence of BS2-10.

Fig. 2 illustrates the assembly of 16 oligonucleotides to form the synthetic TMP fragment of Fig. 1
and its cloning into pUC18, designated BS2-10.

Fig. 3 illustrates the DNA and amino acid sequence of FSG, indicating the restriction sites and 
subfragments used for assembly. 

Fig. 4 is a comparison of the amino acid sequence used to develop the synthetic HIV-1 envelope 
gene with known amino acid sequences of 13 independent isolates, indicating all linker-derived sequences 
( + ) and amino acid substitutions ("). 

Fig. 5 is a schematic diagram of the assembly and cloning of the major subfragments to form FSG in
pUC18.

Fig. 6 is a schematic diagram of the cloning of FSG into lambda pL expression vectors to generate 
pSD301 and pSD302. 

Fig. 7 illustrates the amino acid sequences of pSD301 and pSD302, indicating all linker-derived
sequences ( + ) and amino acid substitutions (").

Fig. 8 illustrates results of expression analysis of pSD301 and pSD302. A) Coomassie stained gel; B)
Immunoblot using AIDS patients' sera.

Fig. 9 illustrates the DNA and amino acid sequence of the full length synthetic HIV-2 TMP indicating
restriction enzymes used to assemble the gene including linker sequences at both ends to facilitate cloning.

Fig. 10 illustrates the three major subfragments used to construct the synthetic HIV-2 TMP gene.

Fig. 11 is a schematic diagram of the assembly of the major subfragments to form the full length
synthetic HIV-2 TMP and its cloning into pUC8 to generate pJC28.

Fig. 12 is a schematic diagram of the cloning of synthetic HIV-2 TMP fragment A into pUC19 to
generate pJC22 and into pTB210 to generate pJC100.

Fig. 13 is a schematic diagram of the cloning of synthetic HIV-2 TMP into lambda pL expression 
vectors to generate pSD306 and pSD307. 

Fig. 14 indicates the specific amino acid sequences of pL constructs pSD306 and pSD307 indicating 
all linker sequences, HIV-1 gag sequences, and HIV-2 TMP sequences. 

Fig. 15 illustrates results of expression analysis of pSD306 in E. coli CAG456 cells. A) Coomassie 
stained gel; B) Immunoblot using HIV-2 positive human sera. 

Fig. 16 illustrates results of expression analysis of pSD307 in E. coli pRK248.cIts/RR1 cells. A)
Coomassie stained gel; B) Immunoblot using HIV-2 positive human sera.

DETAILED DESCRIPTION OF THE INVENTION 



Synthetic DNA fragments of the HIV genome can be synthesized based on their corresponding amino 
acid sequences. By comparing the particular region of interest between different isolates, a sequence can 
be selected which is different from any sequence that exists in nature, because the sequence is a 
compilation of the sequences from various isolates. For example, the synthetic HIV-1 envelope protein
described in Example 1 is based on the amino acid sequence of four different HIV-1 isolates, namely
HTLV-IIIB, LAV-1, ARV-2 and CDC-451.
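
The idea of compiling one sequence from several isolates can be illustrated with a minimal sketch. It assumes the isolate sequences are already aligned to equal length and applies a simple majority vote at each position, with ties resolved in favor of a chosen reference isolate. The selection rule, the function names and the toy peptides are assumptions made for illustration only; they are not the procedure or the data used to derive the sequences of the invention.

```python
from collections import Counter


def compile_sequence(aligned, reference_index=0):
    """Return a composite sequence: most common residue per column,
    ties resolved in favor of the reference isolate (an assumed rule)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    composite = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        winners = [aa for aa, n in counts.items() if n == top]
        ref = column[reference_index]
        composite.append(ref if ref in winners else winners[0])
    return "".join(composite)


def substitutions(a, b):
    """Count positions at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


# Toy, made-up peptides standing in for four aligned isolates.
isolates = ["MRTAYLA", "MKSAYLA", "MKTAYIA", "MKTGYLA"]
composite = compile_sequence(isolates)
differences = [substitutions(composite, seq) for seq in isolates]
print(composite, differences)
```

In this toy case the composite differs from every input by one residue, which mirrors the point made above that a compiled sequence need not match any single natural isolate.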

Other properties can be built into the sequence. For example, codons can be switched for optimal 
expression in bacteria or yeast, specific restriction sites can be introduced, and other restriction sites can 
be removed. In addition, the sequence should have specific restriction sites at both 5' and 3' ends of the 
fragment to facilitate cloning in a particular vector. Synthetic DNA fragments can be synthesized as follows:
(1) select a unique protein sequence, (2) reverse translate to determine complementary DNA sequence, (3)
optimize codons for bacterial or yeast expression, and (4) introduce and/or remove specific restriction sites.
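
The four steps above can be sketched in a few lines. The codon table below is an illustrative choice of one frequently used E. coli codon per amino acid and is an assumption, not the table used for the invention; the function names, the `overrides` parameter and the toy peptide are likewise hypothetical. The recognition sequences are the standard ones for the named enzymes, all of which are palindromic, so scanning one strand is sufficient.

```python
# One codon per amino acid (illustrative assumption, not the patent's table).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

# Palindromic recognition sequences of enzymes named in the text.
SITES = {
    "BamHI": "GGATCC", "HindIII": "AAGCTT", "EcoRI": "GAATTC",
    "SalI": "GTCGAC", "NcoI": "CCATGG", "SmaI": "CCCGGG",
}


def reverse_translate(protein, overrides=None):
    """Reverse translate with one fixed codon per residue.
    `overrides` maps residue index -> alternative synonymous codon."""
    overrides = overrides or {}
    return "".join(
        overrides.get(i, PREFERRED_CODON[aa]) for i, aa in enumerate(protein)
    )


def find_sites(dna, sites=SITES):
    """Return {enzyme: [0-based positions]} for every site present."""
    hits = {}
    for name, motif in sites.items():
        positions = [
            i for i in range(len(dna) - len(motif) + 1)
            if dna.startswith(motif, i)
        ]
        if positions:
            hits[name] = positions
    return hits


peptide = "MEFK"                        # toy peptide
dna = reverse_translate(peptide)        # Glu-Phe creates an EcoRI site
sites_before = find_sites(dna)
# Remove the site by switching Phe (index 2) to its synonymous codon TTT.
dna_fixed = reverse_translate(peptide, overrides={2: "TTT"})
sites_after = find_sites(dna_fixed)
print(dna, sites_before, dna_fixed, sites_after)
```

Using a single codon per amino acid and deviating only where a restriction site must be added or removed keeps the number of design decisions small, at the cost of ignoring effects such as mRNA secondary structure.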

Sixty-one distinct nucleotide codons code for 20 amino acids giving rise to the degeneracy in the
genetic code. Researchers have reported the frequencies of codons used in nucleic acids for both
eukaryotic and prokaryotic organisms. (Grantham et al., Nucleic Acids Res. [1980] 9:43; Gouy et al.,
Nucleic Acids Res. [1982] 10:7055; Sharp et al., Nucleic Acids Res. [1986] 14:7737.) Sequences from the
entire E. coli genome, with examples of sequences from the chromosome, transposons, and plasmids have
been analyzed. These sequences code for structural proteins, enzymes and regulatory proteins. Correlation
has been shown between the degree of codon bias within a particular gene and the level of gene
expression.
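
The degeneracy figures quoted above can be checked directly against the standard genetic code. The short sketch below builds the 64-codon table from the conventional compact encoding (bases ordered T, C, A, G, third position varying fastest) and counts sense codons per amino acid; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Sense codons are those that encode an amino acid rather than a stop.
sense = {codon: aa for codon, aa in CODON_TABLE.items() if aa != "*"}
degeneracy = Counter(sense.values())

print(len(sense), "sense codons for", len(degeneracy), "amino acids")
print(sorted(degeneracy.items(), key=lambda kv: -kv[1]))
```

Amino acids with several synonymous codons (for example leucine, serine and arginine, with six each) are where codon selection has the most freedom; methionine and tryptophan have a single codon and offer none.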
It is preferred that the same codon triplet for each particular amino acid of the synthetic DNA sequence
be used. However, alternative codon(s) can be used to add or delete a particular restriction site. The 
sequence should include unique restriction sites which can be used to delete a specific fragment or 
sequence, or substitute a particular sequence in case of an error in the original synthesis. 

Vector systems which can be used include plant, bacterial, yeast, insect, and mammalian expression 
systems. It is preferred that the codons are optimized for expression in the system used. The proteins and 
polypeptides provided by the invention, which are cloned and expressed in heterologous systems, as 
described above, can be used for diagnostic and therapeutic purposes. 

A preferred expression system utilizes the lambda pL vector system. This expression system has the 
following features: (1) a strong lambda pL promoter, (2) a strong three-frame translation terminator rrnBt1,
and (3) translation starts at an ATG codon, eight base pairs from the ribosome binding site located within an
accessible NcoI restriction site.

Another advantage of the expression system of the present invention is that one can customize the
synthetic DNA fragments so they contain specific DNA sequences which express proteins with desired
amino acid sequences, and can further add, at either the 5' or 3' end, other DNA
sequences to facilitate the transfer of synthetic fragments into various vectors.

Additionally, the use of particular restriction sites at both ends of the fragment may also facilitate 
incorporation of the fragment into other sequences to generate fusion proteins, which can also be used as 
diagnostic and therapeutic reagents. For example, the HIV-1 gp41 sequence can be incorporated within or 
at the end of core/surface antigen of the hepatitis B viral sequence to generate a fusion protein which can 
be used in a single assay screening system for the detection of both AIDS and Hepatitis B in prospective 
blood donors. Alternatively, the assay can be used to track the course of a patient's infection. 

Other proteins from any source, including bacterial, yeast, insect, plant or mammalian, can be used with 
the synthetic DNA fragments of the invention to produce fusion proteins. Those which are expressed 
efficiently in their respective expression systems are especially preferred because they can enhance the 
expression of the synthetic fragment of the fusion protein. 

The synthetic DNA sequences of the present invention, derived from several HIV isolates and optimized
for expression in E. coli, provide continuous availability and uniformity of HIV antigen preparations which
will recognize test sera from individuals exposed to genetically distinguishable variant HIV isolates. The
recombinant antigen expression is very stable since E. coli codons have been used for its synthesis. 
Moreover, the insertion of specific restriction sites allows addition, deletion, or substitution in important 
antigenic epitopes in the coding sequences, a property difficult to achieve when naturally occurring HIV 
sequences are utilized for expression. Construction of synthetic genes also allows the addition of protein 
sequences at either amino- or carboxyl- terminus of the gene thereby allowing the development of novel 
reagents. For example, a fusion gene can be produced comprising a fusion between HIV-1 core antigen and 
HIV-1 envelope synthetic gene. More specifically the envelope synthetic gene comprises the carboxyl- 
terminus HIV-1 gp120 sequence and the full length HIV-1 gp41. Similarly, the HIV-1 core antigen DNA
sequence can be fused to the HIV-2 gp41 sequences, both of which can be expressed at high levels in
heterologous host systems such as E. coli. 

E. coli strains containing plasmids useful for constructs of the invention have been deposited at the 
American Type Culture Collection, Rockville, Maryland, on November 22, 1988, under the accession nos. 
ATCC 67855 (pSD301/RR1/pRK248.cIts) and ATCC 67856 (pSD306/CAG456).

The following examples further describe the invention. The examples are not intended to limit the 
invention in any manner. 



Reagents and enzymes 

Media such as Luria-Bertani (LB) and Superbroth II (Dri Form) were obtained from Gibco Laboratories
Life Technologies, Inc., Madison, Wisconsin. Restriction enzymes, Klenow fragment of DNA polymerase I,
T4 DNA ligase, T4 polynucleotide kinase, nucleic acid molecular weight standards, M13 sequencing
system, X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside), IPTG (isopropyl-β-D-thiogalactoside), glycerol,
dithiothreitol, and 4-chloro-1-naphthol were purchased from Boehringer Mannheim Biochemicals, Indianapolis,
Indiana; or New England Biolabs, Inc., Beverly, Massachusetts; or Bethesda Research Laboratories Life
Technologies, Inc., Gaithersburg, Maryland. Prestained protein molecular weight standards, acrylamide
(crystallized, electrophoretic grade >99%); N,N'-methylene-bis-acrylamide (BIS);
N,N,N',N'-tetramethylethylenediamine (TEMED) and sodium dodecylsulfate (SDS) were purchased from BioRad
Laboratories, Richmond, California. Lysozyme and ampicillin were obtained from Sigma Chemical Co., St.
Louis, Missouri. Horseradish peroxidase (HRPO) labeled secondary antibodies were obtained from
Kirkegaard & Perry Laboratories, Inc., Gaithersburg, Maryland. Seaplaque agarose (low melting agarose) was
purchased from FMC Bioproducts, Rockland, Maine.

T50E10 contained 50 mM Tris, pH 8.0, 10 mM EDTA; 1X TG contained 100 mM Tris, pH 7.5 and 10%
glycerol; 2X SDS/PAGE loading buffer consisted of 15% glycerol, 5% SDS, 100 mM Tris base, 1 M
β-mercaptoethanol and 0.8% Bromophenol blue dye; TBS contained 50 mM Tris, pH 8.0, and 150 mM
sodium chloride; Blocking solution consisted of 5% Carnation nonfat dry milk in TBS.



Host cell cultures, DNA sources and vectors

E. coli JM103 cells, pUC8, pUC18, pUC19 and M13 cloning vectors were purchased from Pharmacia
LKB Biotechnology, Inc., Piscataway, New Jersey; Competent Epicurean™ coli strains XL1-Blue and JM109
were purchased from Stratagene Cloning Systems, La Jolla, California. RR1 cells were obtained from Coli
Genetic Stock Center, Yale University, New Haven, Connecticut; and E. coli CAG456 cells from Dr. Carol
Gross, University of Wisconsin, Madison, Wisconsin. Vector pRK248.cIts was obtained from Dr. Donald R.
Helinski, University of California, San Diego, California.



General methods

All restriction enzyme digestions were performed according to suppliers' instructions. At least 5 units of
enzyme were used per microgram of DNA, and sufficient incubation was allowed to complete digestions of
DNA. Standard procedures were used for mini cell lysate DNA preparation, phenol-chloroform extraction,
ethanol precipitation of DNA, restriction analysis of DNA on agarose, and low melting agarose gel
purification of DNA fragments (Maniatis et al., Molecular Cloning: A Laboratory Manual [New York: Cold
Spring Harbor, 1982]). Plasmid isolations from E. coli strains used the alkali lysis procedure and cesium
chloride-ethidium bromide density gradient method (Maniatis et al., supra). Standard buffers were used for
T4 DNA ligase and T4 polynucleotide kinase (Maniatis et al., supra).



EXAMPLES 




Example 1 



Cloning strategy of codon-optimized synthetic HIV-1 envelope protein



In order to develop a synthetic gene encoding the HIV-1 envelope glycoprotein and fragments thereof,
the amino acid sequences of four independent HIV-1 viral isolates designated as HTLV-IIIB (BH102), LAV-1
(MAL), ARV-2 (SF), and CDC-451 (CDC42) were compared. A unique amino acid sequence from the four
isolates (Fig. 1) was selected to derive a fragment with amino acid residues nos. 552-668 (numbering by
Ratner et al., supra). This fragment contained nine amino acid substitutions (8%) as compared to the
HTLV-IIIB (BH102) isolate. This amino acid sequence was reverse translated using codons optimized to
facilitate high level expression in E. coli. The ambiguous nucleotides remaining in the second and/or third
base of the codon were assigned to facilitate molecular cloning, and the addition, substitution, or deletion of
sequences. The DNA sequence was then subdivided into eight double stranded fragments with unique 6 bp
overhangs to direct specific annealing. The sixteen individual oligonucleotides were synthesized on an Applied
Biosystems 380A DNA synthesizer using methods and reagents recommended by the manufacturer. These
purified oligonucleotides were annealed and ligated together to assemble the entire fragment which was
digested with BamHI and SalI, ligated into pUC18 and transformed into E. coli JM103 cells. A clone
designated BS2-10 (Fig. 2) was isolated, restriction mapped and its DNA sequence confirmed using the
Sanger dideoxy chain termination method (Sanger et al., J. Mol. Biol. (1982) 162:729).
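
The subdivision of a double-stranded sequence into fragments with unique 6-base overhangs can be sketched as follows. The sketch staggers the bottom-strand oligonucleotides by the overhang length at each internal junction and checks that no overhang is palindromic or shared (directly or as a reverse complement) with another junction. The sequence, the cut points and all function names are made-up assumptions; the actual oligonucleotide boundaries are those shown in Fig. 2.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_oligos(top, cut_points, overhang=6):
    """Split `top` into (top_oligo, bottom_oligo) pairs, both written 5'->3'.
    Bottom oligos are shifted by `overhang` at internal junctions only."""
    bounds = [0] + list(cut_points) + [len(top)]
    last = len(bounds) - 2
    pairs = []
    for i in range(len(bounds) - 1):
        start, end = bounds[i], bounds[i + 1]
        b_start = start + overhang if i > 0 else 0
        b_end = end + overhang if i < last else len(top)
        pairs.append((top[start:end], revcomp(top[b_start:b_end])))
    return pairs


def junction_overhangs(top, cut_points, overhang=6):
    """Single-stranded sequence exposed at each internal junction."""
    return [top[p:p + overhang] for p in cut_points]


def overhangs_are_specific(overhangs):
    """True if no overhang is palindromic or able to pair at another junction."""
    seen = set()
    for oh in overhangs:
        rc = revcomp(oh)
        if oh == rc or oh in seen or rc in seen:
            return False
        seen.add(oh)
    return True


# Toy 48-base top strand split into four fragments (eight oligonucleotides).
top = "ATGGCTAAAGGT" "ACCGTTCTGGAA" "CAGATCCTGCGT" "GAAAACTGGTAA"
cuts = [12, 24, 36]
pairs = design_oligos(top, cuts)
overhangs = junction_overhangs(top, cuts)
print(overhangs, overhangs_are_specific(overhangs))
```

With eight fragments, as in the text, the same scheme yields seven internal junctions and sixteen oligonucleotides.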

In order to establish that clone BS2-10 expressed this unique HIV-1 transmembrane protein (TMP)
fragment, the BS2-10/JM103 culture was grown at 37°C in 50 ml Luria Broth, in a 250 ml Erlenmeyer flask.
When the cultures reached an OD600 of 0.3-0.5, IPTG was added to a final concentration of 1 mM to
induce expression. Samples (1.5 ml) were removed at 1 hr intervals, and the cells were pelleted and
resuspended to an OD600 of 10.0 in 2X SDS/PAGE loading buffer. Aliquots (15 µl) of the prepared samples
were loaded on a 15% SDS/PAGE gel, and the proteins were separated and then electrophoretically
transferred to nitrocellulose for immunoblotting. The nitrocellulose sheet containing the transferred proteins
was incubated with Blocking Solution for one hour and incubated overnight at 4°C with AIDS patients' sera
diluted in TBS containing 5% E. coli JM103 lysate. The nitrocellulose sheet was washed three times in
TBS, then incubated with HRPO-labeled goat anti-human IgG, diluted in TBS containing 10% fetal calf serum.
The nitrocellulose was washed three times with TBS and the color was developed in TBS containing 2
mg/ml 4-chloro-1-naphthol, 0.02% hydrogen peroxide and 17% methanol. Clone BS2-10 demonstrated a
strongly immunoreactive band with AIDS patients' sera indicating that the synthetic HIV-1 TMP fragment
was expressed in E. coli.

In order to assemble the full length HIV-1 transmembrane protein, as well as the
extreme carboxyl-terminal 37 amino acids of gp120, the amino acid sequences of the four isolates
described previously were compared to each other to derive a unique amino acid sequence for this gene.
After this unique amino acid sequence was reverse translated using codons optimized for E. coli
expression, the ambiguous nucleotides were assigned as previously described. The full length synthetic
HIV-1 envelope gene (FSG) was divided into six additional subfragments. The complete DNA and amino
acid sequence of FSG is shown in Fig. 3, indicating the restriction sites and subfragments used for
assembly. Fig. 4 is a comparison of the amino acid sequence used to develop the synthetic HIV-1 envelope
gene with known amino acid sequences of 13 independent isolates reported in the Los Alamos HIV Data
Bank (Meyers et al., Human Retroviruses and AIDS (1987), Los Alamos National Laboratory). The Genalign
program of Intelligenetics was used to align these sequences, and the alignment demonstrates that FSG
(designated SYNGENE in Fig. 4) retains substantial overall sequence homology compared to other known
isolates. Alignment parameters and alignment scores of the individual sequences are also shown.



Synthesis and cloning of subfragments 

The subfragments located downstream from BS2-10, designated 413-1 through 413-4, were synthesized
along with additional sequences containing a BamHI restriction site at the 5' end and a HindIII restriction
site at the 3' end to facilitate molecular cloning and DNA sequence analysis of the individual subfragments.
The subfragments located upstream of BS2-10 were also synthesized with additional sequences containing
restriction sites useful for cloning and DNA sequence analysis. The subfragment encoding the
carboxyl-terminal gp120 amino acid sequence, designated c-term gp120, contained EcoRI and BamHI restriction
sites on the 5' end and BglII and SmaI restriction sites on the 3' end. Similarly, subfragment 415 contained
a BglII site on the 5' end and BglII and BamHI restriction sites on the 3' end. With the exception of the
c-term gp120 subfragment, in which both strands were synthesized as described for BS2-10, the remaining
subfragments of FSG were synthesized by a method utilizing the Klenow fragment of DNA polymerase I. In
this method, oligonucleotides comprising opposite strands of a particular subfragment, which contained ten
complementary bases, were synthesized and annealed. The second complementary strand was then filled
in by the Klenow fragment of DNA polymerase I in the presence of the four deoxynucleotides in a manner
similar to that described by Sanger et al., supra, for DNA sequencing. The resulting double-stranded
subfragment was then digested with the appropriate restriction enzymes and cloned into pUC vectors to
confirm the DNA sequence, as previously described. Subfragments 413-1 through 413-4 were cloned into
pUC18 using the BamHI and HindIII restriction sites common to all. Subfragment c-term gp120 was cloned
into pUC8 using the EcoRI and SmaI restriction sites. Subfragment 415 was cloned into the plasmid
containing c-term gp120 at the BglII restriction site and screened for proper orientation by restriction
mapping. The plasmid DNAs for all subfragments were prepared by the cesium chloride buoyant density
gradient method and the individual DNA sequences were confirmed directly from the double-stranded
template (Hattori et al., Nucl. Acids Res. (1985) 13:7813).
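
The fill-in approach can be modeled in a few lines: two oligonucleotides from opposite strands anneal through ten complementary bases at their 3' ends, and polymerase extension of each 3' end completes the duplex. The sketch below only computes the sequence of the resulting top strand; the function names and the toy sequence are assumptions, and the flanking BamHI and HindIII sites are included merely to echo the cloning sites mentioned above.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def klenow_fill_in(top_oligo, bottom_oligo, overlap=10):
    """Return the top strand of the duplex obtained after annealing two
    oligos (both written 5'->3') by their 3' ends and filling in."""
    bottom_as_top = revcomp(bottom_oligo)
    if top_oligo[-overlap:] != bottom_as_top[:overlap]:
        raise ValueError("3' ends are not complementary over the overlap")
    return top_oligo + bottom_as_top[overlap:]


# Toy 36-base subfragment: BamHI site, short insert, HindIII site.
full = "GGATCC" "ATGAAAGGTCTGACCGTTGAACAG" "AAGCTT"
top_oligo = full[:23]                 # covers bases 0-22 of the top strand
bottom_oligo = revcomp(full[13:])     # covers bases 13-35, opposite strand
product = klenow_fill_in(top_oligo, bottom_oligo)
print(product == full)
```

Only 46 bases are synthesized chemically here to obtain a 36 bp duplex; the saving grows with fragment length, since each oligonucleotide needs to span only its own half plus the short overlap.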






Assembly and cloning of FSG 

Subfragments located downstream from BS2-10 were cloned in a stepwise fashion utilizing unique
internal restriction sites at the 5' end and a common HindIII site at the 3' end. For example, subfragment
413-1 was cloned into BS2-10 at the SalI and HindIII restriction sites to generate clone BS2-10A, into which
413-2 was inserted at the HpaI and HindIII restriction sites to generate clone BS2-10B. Similarly,
subfragments 413-3 and 413-4 were added using unique EcoRV and SnaBI restriction sites, respectively. The two
subfragments located upstream of clone BS2-10, having been cloned together in pUC8, were ligated to
BS2-10 as a BamHI fragment. Fig. 5 shows the cloning method used to assemble the synthetic HIV-1
envelope gene in pUC18. The final clone, designated FSG, was restriction mapped to confirm the proper
orientation of the BamHI-BamHI fragment.



Example 2 




Cloning and expression of FSG in lambda pL Vector Systems 




Expression analysis of FSG was carried out in vector systems utilizing the strong lambda pL promoter
and the temperature sensitive cI repressor gene (Bernard et al., Gene (1979) 5:59). The specific vectors
used in these analyses are derivatives of pBR322, containing a lambda pL promoter and a synthetic
Shine-Dalgarno sequence, followed by restriction sites used for cloning various genes of interest. In addition,
these vectors contain the strong three-frame translation terminator rrnBt1. Vector pSDKR816 contains an
NcoI restriction site which provides an ATG start codon optimally spaced from the start of transcription. Fig.
6 schematically presents the cloning of FSG into pSDKR816 to generate clone pSD301. Briefly, FSG was
digested with HindIII and SmaI, the ends were made blunt by filling in with the Klenow fragment of DNA
polymerase I, and the 1209 bp fragment was purified and ligated into pSDKR816 at the NcoI site filled in
with the Klenow fragment of DNA polymerase I. After transformation into E. coli RR1 cells containing the
cIts gene on the compatible vector pRK248, a clone with FSG in the proper orientation was isolated by
restriction mapping and designated pSD301. The specific amino acid sequence encoded by pSD301 is
presented in Fig. 7 indicating all linker derived sequences ( + ) and all amino acid substitutions within the
HIV-1 envelope sequences not yet identified in any published sequence (").

Additionally, FSG was cloned as a fusion to the HIV-1 gag protein (amino acid residue nos. 121-407,
numbering by Ratner et al., supra) which is highly expressed under control of the lambda pL promoter in
vector pKRR955. FSG was digested with AvaI, the ends were made blunt by filling in with the Klenow
fragment of DNA polymerase I, and the 1199 bp fragment was purified and ligated into pKRR955 at the
SmaI restriction site to form an HIV-1 gag/synthetic env fusion protein (Fig. 6). After transformation into
E. coli pRK248.cIts/RR1 cells, a clone containing FSG in the proper orientation was identified by restriction
mapping and designated pSD302. The specific amino acid sequence of this fusion protein is presented in
Fig. 7 indicating all linker derived sequences, HIV-1 gag sequences, and HIV-1 envelope sequences as
previously described.

Fifty ml cultures of pSD301 and pSD302 in E. coli pRK248.cIts/RR1 cells were grown in Superbroth II
media at 30°C to an OD600 of 0.5, at which time the cultures were shifted to 42°C to inactivate the
temperature sensitive cI repressor and thereby induce expression from the lambda pL promoter. Two samples
(2.0 ml each) were removed at 1 hr intervals. Sample preparation was as follows.

The cells were pelleted, then resuspended in either 1X TG buffer or T50E10 buffer. An equal volume of
2X SDS/PAGE loading buffer was added to the 1X TG suspended cells to produce the whole lysate. The
sample resuspended in T50E10 was sonicated eight times for 30 seconds each, at a power setting of 10
watts, using the microtip provided with the Vibra Cell Sonicator (Sonics and Materials, Inc., Danbury, CT).
The sonicated sample was then centrifuged to remove the insoluble fraction which was resuspended in the
original starting volume of T50E10. An equal volume of 2X SDS/PAGE loading buffer was added to both the
sonicated soluble and insoluble fractions, which together with the whole cell lysate, were boiled for 5 min,
centrifuged to remove any remaining insoluble material, and aliquots (15 µl) were separated on duplicate
12.5% SDS/PAGE gels. Proteins from one such gel were electrophoretically transferred to nitrocellulose for
immunoblotting with AIDS patients' sera, as previously described. The second gel was fixed in a solution of
50% methanol, 10% acetic acid for twenty minutes at room temperature, and then stained with 0.25%
Coomassie blue dye in a solution of 50% methanol, 10% acetic acid for 30 minutes. Destaining was carried
out using a solution of 10% methanol, 7% acetic acid for 3-4 hr, or until a clear background was obtained.

Fig. 8 presents the expression of pSD301 and pSD302 prior to (T0) and four hours post (T4) induction,
analyzed by Coomassie blue staining (Fig. 8A) and immunoblotting (Fig. 8B). Samples were pKRR955 (T0
whole cell lysate [lane 1], T4 whole cell lysate [lane 2]); pSD301 (T0 whole cell lysate [lane 3], T4 whole
cell lysate [lane 4], T4 sonicated soluble fraction [lane 5], and T4 sonicated insoluble fraction [lane 6]); and
pSD302 (T0 whole cell lysate [lane 7], T4 whole cell lysate [lane 8], T4 sonicated soluble fraction [lane 9],
and T4 sonicated insoluble fraction [lane 10]). Molecular weight standards were run in lane 11. Arrows
indicate the position of the induced proteins which are clearly visualized in both the whole cell lysate and
the sonicated insoluble cell fraction by Coomassie blue staining (Fig. 8A). Lane 2 indicates that pKRR955
expressed the HIV-1 gag protein at a level greater than 25% of total cellular protein, lane 4 indicates that
pSD301 expressed the synthetic HIV-1 envelope protein at a level of approximately 12% of total cellular
protein, and lane 8 indicates that pSD302 expressed the HIV-1 gag/synthetic env fusion protein at a level of
approximately 5% of total cellular protein. The expression levels obtained using FSG were significantly
higher than those obtained using the corresponding native viral DNA sequences in similar pL vector
systems. All three recombinant proteins were highly reactive with AIDS patients' sera (Fig. 8B). These data
demonstrate that the synthetic HIV-1 envelope gene, including the hydrophobic region of the transmembrane
protein, can be efficiently expressed in E. coli, and the expressed proteins are highly immunoreactive.


Example 3 






Synthesis and cloning of synthetic HIV-2 TMP and fragment thereof 

The entire HIV-2 transmembrane protein (TMP) was chemically synthesized using the method of
oligonucleotide directed double-stranded break repair disclosed in U.S. Patent Application Serial No.
883,242, filed July 8, 1986 by Mandecki (EPO 87109357.1), which is incorporated herein by reference.
Envelope amino acid residues 502-858 of the HIV-2 ROD isolate (numbering by Guyader et al., supra) were
reverse translated using codon assignments optimal for expression in E. coli. After specific nucleotides
were assigned to the remaining ambiguous nucleotides, as previously described, the full length TMP
sequence was generated as indicated in Fig. 9. The synthetic gene was assembled and cloned as three
separate subfragments represented by fragment A, a 335 bp HindIII-NcoI fragment, fragment B, a 309 bp
NcoI-BamHI fragment (29 hydrophobic amino acid residues deleted), and fragment C, a 362 bp
BamHI-HindIII fragment, as depicted in Fig. 10. A fourth fragment containing the deleted twenty-nine hydrophobic
amino acid residues was cloned into the 309 bp NcoI-BamHI fragment as an EcoRV-SnaBI fragment (Fig.
10). The three major subfragments were cloned into pUC vectors, transformed into JM109 cells and their
primary nucleotide sequences confirmed, as previously described. The fragments were then gel-purified
and ligated together to form the 1089 bp full length synthetic HIV-2 TMP. This 1089 bp HindIII fragment
was cloned into pUC8 and designated pJC28 (Fig. 11).

Fragment A encoding the amino terminal 108 amino acids of HIV-2 TMP (from Tyr 502 to Trp 609
[Guyader et al., supra]) was cloned at the HindIII-SalI sites of pUC19. A clone, designated pJC22, was
identified by restriction mapping and its primary nucleotide sequence was confirmed. Plasmid pJC22 was
digested with HindIII-Asp718 to release a 361 bp fragment containing the synthetic HIV-2 TMP gene
fragment, which was ligated into the HindIII-Asp718 sites of plasmid pTB210 and transformed into E. coli
XL1 cells. Plasmid pTB210 is disclosed in a U.S. Patent Application entitled "CKS Method of Protein
Synthesis", concurrently filed by T. Bolling et al., which is a CIP of an earlier application, U.S. Serial No.
167,067, filed March 11, 1988, which is hereby incorporated by reference. A clone, designated pJC100 (Fig.
12), was isolated and restriction mapped to identify the hybrid gene of kdsB (encoding CKS) and HIV-2
TMP fragment.
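As a quick consistency check on the fragment A description, the following sketch translates the first 69 nucleotides of the fragment A coding sequence, as printed in Claim 11, using the standard genetic code, and confirms that the residue range Tyr 502 to Trp 609 spans 108 residues. The helper names `translate` and `residue_count` are illustrative assumptions, not part of the specification.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# First 69 nucleotides of the fragment A coding sequence as printed in Claim 11.
FRAGMENT_A_START = (
    "TACTCTTCCGCTCACGGCCGTCACACCCGTGGCGTTTTC"
    "GTTCTGGGCTTCCTGGGCTTCCTGGCTACC"
)


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def residue_count(first: int, last: int) -> int:
    """Number of residues in an inclusive numbering range."""
    return last - first + 1


if __name__ == "__main__":
    print(translate(FRAGMENT_A_START))   # YSSAHGRHTRGVFVLGFLGFLAT
    print(residue_count(502, 609))       # 108
```

The translated residues match the opening of the polypeptide fragment recited in Claim 3, and 108 residues correspond to 324 coding nucleotides.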

Example 4

Cloning of synthetic HIV-2 TMP in lambda pL vectors

The 1089 bp HindIII fragment containing the entire HIV-2 TMP was isolated from pJC28, filled in with
the Klenow fragment of DNA polymerase I to produce blunt ends and cloned directly behind an ATG start
codon provided by the filled in NcoI site of pSD305 (pSDKR816 previously described with cIts inserted).
Similarly, a 1097 bp SalI-Asp718 fragment containing the entire HIV-2 TMP was isolated from pJC28, filled
in with the Klenow fragment of DNA polymerase I to produce blunt ends and cloned at the SmaI site of
pKRR955 (previously described) to produce an HIV-1 gag/HIV-2 TMP fusion protein. The clone containing
the HIV-2 TMP gene under control of the lambda pL promoter was designated pSD306 and the clone
containing the HIV-2 TMP as a fusion to HIV-1 gag under control of the lambda pL promoter was
designated pSD307, as outlined in Fig. 13. After transformation of pSD306 into E. coli CAG456 cells (Baker,
PNAS (1984) 81:6779) and pSD307 into E. coli pRK248.cIts/RR1 cells, single cell clones were isolated and
restriction mapped to demonstrate the presence and orientation of the HIV-2 TMP gene. The specific amino
acid sequences of pSD306 and pSD307 are presented in Fig. 14, indicating linker derived sequences, HIV-1
gag sequences, and HIV-2 TMP sequences. Expression of the synthetic HIV-2 TMP gene was induced in
these cultures by temperature shift methods, as previously described. Aliquots of the cultures before and
after induction were subjected to SDS/PAGE analysis for both Coomassie blue staining and immunoblotting
using HIV-2 positive human sera, as previously described for the synthetic HIV-1 envelope gene product.
Whole cell lysates and the sonicated soluble and insoluble fractions of the cultures were analyzed and are
illustrated in Figures 15 and 16 for the pSD306 and pSD307 constructs, respectively.

Fig. 15 presents the expression of pSD306 prior to (T0) and two hours post (T2) induction, analyzed by
Coomassie blue staining (Fig. 15A) and immunoblotting (Fig. 15B). Samples were T0 whole cell lysate (lane
1); T0 sonicated soluble fraction (lane 2); T0 sonicated insoluble fraction (lane 3); T2 whole cell lysate (lane
4); T2 sonicated soluble fraction (lane 5); T2 sonicated insoluble fraction (lane 6); and BioRad prestained
molecular weight markers (lane M). Fig. 15 demonstrates that pSD306 expressed a significant amount of
the HIV-2 TMP at time T2, as indicated by the arrows on both the Coomassie stained gel and the
immunoblot. This expressed protein is visible in both the whole cell lysate and the sonicated insoluble
cell fraction of these cultures.

Similarly, Fig. 16 presents the expression of pSD307 prior to (T0) and two hours post (T2) induction,
analyzed by Coomassie blue staining (Fig. 16A) and immunoblotting (Fig. 16B). Samples were pKRR955,
T2 whole cell lysate (lane 1); pSD307, T0 whole cell lysate (lane 2); T0 sonicated soluble fraction (lane 3);
T0 sonicated insoluble fraction (lane 4); T2 whole cell lysate (lane 5); T2 sonicated soluble fraction (lane 6);
T2 sonicated insoluble fraction (lane 7); and BioRad prestained molecular weight markers (lane M). Fig. 16
demonstrates that pSD307 expressed a significant amount of the HIV-1 gag/HIV-2 TMP fusion protein at
time T2, as indicated by the arrows on both the Coomassie stained gel and the immunoblot. This fusion
protein is also visible in both the whole cell lysate and the sonicated insoluble fraction of these cultures.
The HIV-1 gag fusion partner (lane 1), although present at higher levels than the HIV-1 gag/HIV-2 TMP
fusion protein, showed lower immunoreactivity to HIV-2 specific antibodies.



Example 5

Diagnostic utility of synthetic DNA derived HIV proteins



The HIV specific proteins overexpressed in E. coli were purified using procedures known in the art. The
proteins expressed at high levels were immunogenic and were recognized by antibodies produced in
HIV-infected individuals (see Figs. 8, 15 and 16). The HIV specific proteins derived from E. coli can be
utilized in several immunoassay configurations, as described in CIP application U.S. Serial No. 020,282,
filed February 27, 1987 by Dawson et al., which is hereby incorporated by reference. The parent
application is EPO 86116854.0 (December 4, 1986). In a preferred configuration, HIV specific proteins were
coated on a solid support and incubated with test samples. The virus specific antibodies present in the test
sample recognized and were bound to the HIV proteins on the solid support. The HIV specific antibodies
were quantitated by the use of goat anti-human immunoglobulin conjugated to HRPO.

The HIV-1 exposed individuals were detected by the use of HIV-1 specific proteins, such as HIV-1 gp41
and HIV-1 p24 proteins derived by recombinant DNA techniques, described in the CIP application Serial No.
020,282. However, only 70 to 90% of the HIV-2 exposed individuals were detected using these HIV-1
specific proteins, due to cross reactivity between the two strains. The HIV-2 exposed individuals who were
not detected using these HIV-1 specific proteins were detected using synthetic DNA derived HIV-2 proteins.
For example, the HIV-2 TMP fragment fused to CKS (pJC100), when added to the recombinant
HIV-1 proteins on the solid support described above, significantly increased the detection of test samples
containing HIV-2 antibodies, as illustrated in Table 1, below.



Table 1

                      HIV-1 Test        HIV-1/HIV-2 Test

Samples Tested*       127               127
Non Reactive          26 (20.47%)       0 (0%)
Reactive              101 (79.53%)      127 (100%)

* All 127 samples were confirmed positive for the presence of HIV-2
antibodies by western blot analysis using disrupted HIV-2 virus.



Additionally, 3,411 normal blood donors were screened using the HIV-1/HIV-2 recombinant assay described
above. The recombinant assay demonstrated a specificity of 99.77%, with only eight (0.23%) initial reactive
and four (0.12%) repeat reactive samples.
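The percentages quoted for the 127 confirmed HIV-2 positive samples and for the 3,411 blood donors follow directly from the counts. A minimal sketch of the arithmetic, in which the helper name `percent` is an assumption and specificity is taken on the initial reactive count (which is what reproduces the stated 99.77%):

```python
def percent(part: int, total: int) -> float:
    """Percentage of `part` in `total`, rounded to two decimal places."""
    return round(100.0 * part / total, 2)


if __name__ == "__main__":
    # HIV-1 test on 127 confirmed HIV-2 positive samples.
    print(percent(26, 127))           # 20.47  non reactive
    print(percent(101, 127))          # 79.53  reactive

    # HIV-1/HIV-2 recombinant assay on 3,411 normal blood donors.
    donors = 3411
    print(percent(8, donors))           # 0.23   initial reactive
    print(percent(4, donors))           # 0.12   repeat reactive
    print(percent(donors - 8, donors))  # 99.77  specificity (initial reactives)
```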



Differentiation of HIV-1 and HIV-2 infections

Frequently, individuals who have been exposed to HIV-2 have antibodies directed against epitopes on 
HIV-2 proteins which are also present on HIV-1 proteins. Likewise, individuals who have been exposed to 
HIV-1 have antibodies which cross-react with HIV-2 proteins. Because most of the cross-reactions are 
related to the gag gene products, the pJP100 protein and a recombinant protein from HIV-1 envelope 
protein (described in CIP Application Serial Number 020,282) were used to differentiate between individuals 
infected with HIV-1 and HIV-2. 

Two independent enzyme-linked immunoassays were developed. Test 1 used HIV-1 recombinant 
proteins coated upon a solid phase. Test 2 used HIV-2 TMP (pJP100) coated upon a solid phase. 
Specimens from HIV seropositive individuals from the United States, Portugal or West Africa were tested for 
antibodies using these two tests. Endpoint titers were determined by diluting the specimens in normal 
human plasma and testing the dilutions. As illustrated in Table 2 below, specific tests using synthetic 
recombinant proteins can be effectively used to differentiate HIV-1 and HIV-2 infections. 



Table 2

Specimen                    Test 1             Test 2
                            Endpoint Titer     Endpoint Titer

Chicago-AIDS-1              256                <1
Chicago-AIDS-2              512                <1
Chicago-AIDS-3              512                <1
Chicago-Asymptomatic-4      1024               <1
Chicago-Asymptomatic-5      2048               <1
Chicago-Asymptomatic-6      512                <1
West Africa-1               <1                 2048
West Africa-2               <1                 64
Portugal-1                  <1                 512
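The pattern in Table 2 can be expressed as a simple comparison of the two endpoint titers. The sketch below is an illustration only: the decision rule (assign the infection type of the test with the higher titer, treating "<1" as zero) and the names `parse_titer` and `classify` are assumptions and are not stated in the specification.

```python
# Endpoint titers from Table 2: (specimen, Test 1 titer, Test 2 titer).
TABLE_2 = [
    ("Chicago-AIDS-1", "256", "<1"),
    ("Chicago-AIDS-2", "512", "<1"),
    ("Chicago-AIDS-3", "512", "<1"),
    ("Chicago-Asymptomatic-4", "1024", "<1"),
    ("Chicago-Asymptomatic-5", "2048", "<1"),
    ("Chicago-Asymptomatic-6", "512", "<1"),
    ("West Africa-1", "<1", "2048"),
    ("West Africa-2", "<1", "64"),
    ("Portugal-1", "<1", "512"),
]


def parse_titer(text: str) -> int:
    """Convert a titer string to an integer; '<1' is treated as 0 (assumption)."""
    text = text.strip()
    return 0 if text.startswith("<") else int(text)


def classify(test1: str, test2: str) -> str:
    """Hypothetical rule: the test with the higher endpoint titer decides."""
    t1, t2 = parse_titer(test1), parse_titer(test2)
    if t1 > t2:
        return "HIV-1"
    if t2 > t1:
        return "HIV-2"
    return "indeterminate"


if __name__ == "__main__":
    for specimen, t1, t2 in TABLE_2:
        print(f"{specimen}: {classify(t1, t2)}")
```

Under this rule the six Chicago specimens are assigned to HIV-1 and the West Africa and Portugal specimens to HIV-2. A specimen reactive in both tests at similar titers would need a more careful rule than this sketch provides.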



Biological samples which are easily tested by the methods of the present invention include human and 
animal body fluids such as whole blood, serum, plasma, urine, saliva, stools, lymphocyte or cell culture 
preparations and purified and partially purified immunoglobulins. The polypeptides and fragments described 
herein can be used to determine the presence or absence of antibodies to HIV-1 and HIV-2 antigens by 
assay methods known to those skilled in the art, and for distinguishing between HIV-1 and HIV-2 infections. 

One such assay involves: 

a) coating a solid support with the polypeptides and polypeptide fragments disclosed herein; 

b) contacting the coated solid support with the biological sample to form an antibody-polypeptide 
complex; 

c) removing unbound biological sample from the solid support; 

d) contacting the complex on the solid support with a labeled immunoglobulin specific for the 
antibody; and 

e) detecting the label to determine the presence or absence of HIV-1 and/or HIV-2 antibodies in the 
biological sample. 

A second assay method involves: 

a) coating a solid support with the polypeptides and polypeptide fragments disclosed herein; 

b) contacting the coated solid support with the biological sample and the homologous polypeptides 
conjugated to a label; 

c) removing unbound biological sample and unbound labeled polypeptide; and 

d) detecting the label to determine the presence or absence of HIV-1 and/or HIV-2 antibodies in the 
biological sample. 

Solid supports which can be used in such immunoassays include wells of reaction trays, test tubes, 
beads, strips, membranes, filters, microparticles or other solid supports which are well known to those 
skilled in the art. Enzymatic, radioisotopic, fluorescent, chemiluminescent and colloidal particle labels can 
be used in the aforementioned assays. Furthermore, hapten/labeled anti-hapten systems such as a 
biotin/labeled anti-biotin system can be utilized in the inventive assays. Both polyclonal and monoclonal 
antibodies are useful as reagents, and IgG as well as IgM class HIV antibodies may be used as solid 
support or labeled reagents. 

It is evident from the foregoing examples that one skilled in the art could clone together specific 
subfragments of the synthetic genes constructed to generate new synthetic genes that would have the 
same characteristics as those illustrated herein. For example, the c-term gp120 subfragment, BS2-10 and 
subfragment 413-1 can be cloned together to produce synthetic gene products useful as diagnostic and 
therapeutic reagents for AIDS. 

Claims 

1. A polypeptide comprising an amino acid sequence represented by the following: 
MGDPMMRDNWRSELYKYKVVKIEPLGIAPTKAKRRVVQREKRADLAVGILGALFLGFLGAAGSTMGARSL
TLTVQARQLLSGIVQQQNNLLRAIKDPKAQQHLLQLTVWGIKQLQARVLAVERYLKDQQLLGIWGCSGKL
ICTTAVPWNASWSNKSLEDIWNNMTWMQWEREINNYTNLIYSLLEESQNQQEKNEQELLQLDKWVDASLW
NWSNITKWLWYIKLFIMIVGGLAGLRIVFAVLSIVNRVRQGYSPLSFQTRLPNPRGPDRPEGIDEEGGER
DRDRSTRLVDISLALVWEDLRSLCLFSYHRLRDLLLIATRIVELLGRRGWEVLKYWWNLLQYVSQELKNS
AVSLVNATAIAVAEGTDRVIEVVQRAYRAIRHIHRRIRQGLERILLQVHASSLESSWQFGPG.

2. A polypeptide comprising an amino acid sequence represented by the following:
MSLKIYSSAHGRHTRGVFVLGFLGFLATAGSAMGAASLTVSAQSRTLLAGIVQQQQQLLDVVKRQQELLR
LTVWGTKNLQARVTAIEKYLQDQARLNSWGCAFRQVCHTTVPWVNDSLAPDWDNMTWQEWEKQVRYLEAN
ISKSLEQAQIQQEKNMYELQKLNSWDIFGNWFDLTSWVKYIQYGVLIIVAVIALRIVIYVVQMLSRLRKG
YRPVFSSPPGYIQQIHIHKDRGQPANEETEEDGGSNGGDRYWPWPIAYIHFLIRQLIRLLTRLYSICRDL
LSRSFLTLQLIYQNLRDWLRLRTAFLQYGCEWIQEAFQAAARATRETLAGACRGLWRVLERIGRGILAVP
RRIRQGAEIALLVPSSWQFGPG.

3. A polypeptide fragment of the polypeptide of Claim 2 comprising an amino acid sequence
represented by the following:
YSSAHGRHTRGVFVLGFLGFLATAGSAMGAASLTVSAQSRTLLAGIVQQQQQLLDVVKRQQELLRLTVWG
TKNLQARVTAIEKYLQDQARLNSWGCAFRQVCHTTVPW

4. The polypeptide of Claims 1, 2, or 3 produced by E. coli.

5. A fusion polypeptide comprising a polypeptide as in one of Claims 1-3, in which the polypeptide is 
fused to a prokaryotic or eukaryotic protein. 

6. The fusion polypeptide of Claim 5 wherein said prokaryotic or eukaryotic protein is the E. coli
enzyme CKS.

7. A synthetic gene comprising a DNA sequence represented by the following:
ATGGGGGATCCCATGATGCGCGACAACTGGCGCTCTGAACTGTACAAATACAAAGTTGTTAAAATCGAAC
CGCTGGGCATCGCTCCGACCAAAGCTAAACGCCGCGTTGTTCAGCGCGAAAAACGCGCAGATCTAGCTGT
TGGTATCCTGGGTGCTCTGTTTCTGGGTTTTCTGGGTGCTGCTGGTTCTACTATGGGTGCTCGCTCTCTG
ACTCTGACTGTTCAGGCTCGCCAGCTGCTGTCTGGTATCGTTCAGCAGCAGAACAACCTGCTGCGCGCTA
TCAAGGATCCCAAAGCTCAGCAGCATCTGCTGCAACTGACTGTTTGGGGTATCAAACAACTGCAGGCTCG
CGTTCTGGCTGTTGAACGCTACCTGAAAGACCAGCAGCTGCTGGGTATCTGGGGTTGCTCTGGTAAACTG
ATTTGCACTACTGCCGTTCCGTGGAACGCTTCTTGGTCCAACAAATCTCTGGAAGACATCTGGAACAACA
TGACTTGGATGCAATGGGAACGCGAAATCAACAACTACACTAACCTGATCTACTCTCTGCTGGAAGAATC
TCAGAACCAGCAGGAAAAAAACGAACAGGAACTGCTGCAACTGGACAAATGGGTCGACGCTTCTCTGTGG
AACTGGTCTAACATAACTAAATGGCTGTGGTACATCAAACTGTTTATCATGATCGTTGGTGGTCTGGCCG
GCCTGCGCATCGTTTTTGCTGTTCTGTCTATCGTTAACCGCGTTCGCCAGGGTTACTCTCCGCTGTCTTT
TCAGACTCGCCTGCCGAACCCGCGCGGTCCGGACCGCCCGGAAGGTATCGATGAAGAAGGTGGTGAACGC
GACCGCGACCGCTCTACTCGCCTGGTAGATATCTCTCTGGCTCTGGTTTGGGAAGACCTGCGCTCTCTGT
GCCTGTTTTCTTACCATCGCCTGCGCGACCTGCTGCTGATCGCTACTCGCATCGTTGAACTGCTGGGTCG
CCGCGGTTGGGAAGTGCTGAAATACTGGTGGAACCTGCTGCAATACGTATCTCAGGAACTGAAAAACTCT
GCTGTTTCTCTGGTTAATGCTACTGCTATCGCTGTTGCTGAAGGTACTGACCGCGTTATCGAAGTTGTTC
AGCGCGCTTACCGCGCTATCCGCCATATCCATCGCCGCATCCGCCAGGGTCTGGAACGCATCCTGCTGCA
GGTGCATGCCTCGAGTCTAGAAAGCTCATGGCAATTCGGGCCCGGGTAA

8. The synthetic gene of Claim 7 coding for the polypeptide of Claim 1.

9. A synthetic gene comprising a DNA sequence represented by the following:
ATGAGCTTAAAGATCTACTCTTCCGCTCACGGCCGTCACACCCGTGGCGTTTTCGTTCTGGGCTTCCTGG
GCTTCCTGGCTACCGCGGGCTCCGCTATGGGCGCTGCTTCCCTGACCGTTTCCGCTCAGTCCCGTACCCT
GCTGGCTGGCATCGTTCAGCAGCAGCAGCAACTTCTAGACGTTGTTAAACGTCAGCAGGAGCTCCTGCGT
CTGACCGTTTGGGGCACCAAAAACCTGCAGGCTCGTGTTACCGCTATCGAAAAATACCTGCAGGACCAGG
CTCGTCTGAATTCCTGGGGCTGCGCTTTCCGTCAGGTTTGCCACACCACCGTTCCATGGGTTAACGATTC
CCTGGCTCCGGACTGGGACAACATGACCTGGCAGGAATGGGAAAAACAGGTTCGTTACCTGGAAGCTAAC
ATCTCCAAATCCCTGGAACAGGCTCAGATCCAGCAGGAAAAAAACATGTACGAACTGCAGAAACTGAACT
CCTGGGATATCTTCGGCAACTGGTTCGACCTGACCTCCTGGGTTAAATATATCCAGTACGGCGTGCTCAT
CATCGTTGCTGTTATCGCTCTGCGTATCGTTATCTACGTAGTTCAGATGCTGTCCCGTCTGCGTAAAGGC
TACCGTCCGGTTTTCTCTTCCCCCCCGGGCTATATCCAGCAGATCCATATCCACAAAGACCGTGGCCAGC
CGGCTAACGAAGAAACCGAAGAAGACGGCGGATCCAACGGCGGCGACCGTTACTGGCCGTGGCCGATCGC
TTATATCCACTTCCTGATCCGTCAGCTGATCCGTCTGCTGACCCGTCTATACTCCATCTGCCGTGACCTG
CTGTCCCGTTCCTTCCTGACCCTGCAACTGATCTACCAGAACCTGCGTGACTGGCTGCGTCTGCGTACCG
CTTTCCTGCAGTACGGCTGCGAATGGATTCAGGAAGCATTCCAAGCGGCCGCTCGTGCTACCCGTGAAAC
CCTGGCTGGCGCATGCCGTGGCCTGTGGCGTGTTCTGGAACGTATCGGCCGTGGTATCCTGGCTGTTCCG
CGTCGTATCCGTCAGGGCGCCGAAATCGCTCTGCTGGTACCAAGCTCATGGCAATTCGGGCCCGGGTAA

10. The synthetic gene of Claim 9 coding for the polypeptide of Claim 2.

11. A synthetic gene comprising a DNA sequence represented by the following:
TACTCTTCCGCTCACGGCCGTCACACCCGTGGCGTTTTCGTTCTGGGCTTCCTGGGCTTCCTGGCTACCG
CGGGCTCCGCTATGGGCGCTGCTTCCCTGACCGTTTCCGCTCAGTCCCGTACCCTGCTGGCTGGCATCGT
TCAGCAGCAGCAGCAACTTCTAGACGTTGTTAAACGTCAGCAGGAGCTCCTGCGTCTGACCGTTTGGGGC
ACCAAAAACCTGCAGGCTCGTGTTACCGCTATCGAAAAATACCTGCAGGACCAGGCTCGTCTGAATTCCT
GGGGCTGCGCTTTCCGTCAGGTTTGCCACACCACCGTTCCATGG

12. The synthetic gene of Claim 11 coding for the polypeptide of Claim 3. 

13. A fusion polypeptide comprising a polypeptide as in one of Claims 1-3, in which the polypeptide is 
fused to a HIV gag protein. 

14. The fusion polypeptide of Claim 13 wherein said HIV gag protein is an HIV-1 gag protein comprising
an amino acid sequence represented by the following:
DTGHSSQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVEEKAFSPEVIPMFSALSEGATPQDLNTMLNT
VGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMTNNPPIPVGE
IYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDC
KTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNTATIMMQRGNFRNQRKMVKCFNCGKEGH
TARNCRA

15. A method for detecting antibodies to HIV antigens in an individual which comprises the steps of:

a) obtaining a sample of a body fluid from the individual;

b) incubating said body fluid with said polypeptide of Claims 1, 2 or 3;

c) incubating said body fluid with a labeled antibody to immunoglobulin; and

d) detecting said label and determining therefrom the presence or absence of antibodies to HIV
antigens.

16. A method for detecting antibodies to HIV antigens in an individual which comprises the steps of:

a) obtaining a sample of a body fluid from the individual;

b) incubating said body fluid with said polypeptide of Claims 1, 2 or 3;

c) incubating said body fluid with a labeled antigen; and

d) detecting said label and determining therefrom the presence or absence of antibodies to HIV antigens.
[Figure: clustered alignment of five envelope fragment sequences, residues 1-107 of each, in the
clustered order 2. CDC42FRAG.PEP, 3. BH102FRAG.PEP, 4. SF2FRAG.PEP, 1. MALFRAG.PEP, 5. SYNFRAG.PEP.
The one fully legible row of the alignment reads:

  1 KAQQHLLQLTVWGIKQLQARVLAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEDIWNNMT
 69 WMQWEREINNYTNLIYSLLEESQNQQEKNEQELLQLDKW]
[Figure: nucleotide sequence (positions 1-1233) and deduced amino acid sequence of the synthetic HIV-1
envelope gene, annotated with restriction sites (EcoRI, AvaI, BamHI, BglII, SalI, HpaI, EcoRV, SnaBI,
HindIII) and subfragment labels (C-term gp120, 413-2). The coding sequence is recited in Claim 7.]
PSD301.PEP

MGDPMMRDNW RSELYKYKVV KIEPLGIAPT KAKRRVVQRE KRADLAVGIL GALFLGFLGA AGSTMGARSL   70
TLTVQARQLL SGIVQQQNNL LRAIKDPKAQ QHLLQLTVWG IKQLQARVLA VERYLKDQQL LGIWGCSGKL  140
ICTTAVPWNA SWSNKSLEDI WNNMTWMQWE REINNYTNLI YSLLEESQNQ QEKNEQELLQ LDKWVDASLW  210
NWSNITKWLW YIKLFIMIVG GLAGLRIVFA VLSIVNRVRQ GYSPLSFQTR LPNPRGPDRP EGIDEEGGER  280
DRDRSTRLVD ISLALVWEDL RSLCLFSYHR LRDLLLIATR IVELLGRRGW EVLKYWWNLL QYVSQELKNS  350
AVSLVNATAI AVAEGTDRVI EVVQRAYRAI RHIHRRIRQG LERILLQVHA SSLESSWQFG PG.

PSD302.PEP

MTMITPSLAA GPDTGHSSQV SQNYPIVQNI QGQMVHQAIS PRTLNAWVKV VEEKAFSPEV IPMFSALSEG   70
ATPQDLNTML NTVGGHQAAM QMLKETINEE AAEWDRVHPV HAGPIAPGQM REPRGSDIAG TTSTLQEQIG  140
WMTNNPPIPV GEIYKRWIIL GLNKIVRMYS PTSILDIRQG PKEPFRDYVD RFYKTLRAEQ ASQEVKNWMT  210
ETLLVQNANP DCKTILKALG PAATLEEMMT ACQGVGGPGH KARVLAEAMS QVTNTATIMM QRGNFRNQRK  280
MVKCFNCGKE GHTARNCRAP GDPMMRDNWR SELYKYKVVK IEPLGIAPTK AKRRVVQREK RADLAVGILG  350
ALFLGFLGAA GSTMGARSLT LTVQARQLLS GIVQQQNNLL RAIKDPKAQQ HLLQLTVWGI KQLQARVLAV  420
ERYLKDQQLL GIWGCSGKLI CTTAVPWNAS WSNKSLEDIW NNMTWMQWER EINNYTNLIY SLLEESQNQQ  490
EKNEQELLQL DKWVDASLWN WSNITKWLWY IKLFIMIVGG LAGLRIVFAV LSIVNRVRQG YSPLSFQTRL  560
PNPRGPDRPE GIDEEGGERD RDRSTRLVDI SLALVWEDLR SLCLFSYHRL RDLLLIATRI VELLGRRGWE  630
VLKYWWNLLQ YVSQELKNSA VSLVNATAIA VAEGTDRVIE VVQRAYRAIR HIHRRIRQGL ERILLQVHAS  700
RVIN.

Annotations in the original figure mark the linker sequences, the HIV-1 gag sequence and the HIV-1 env
sequence within each polypeptide.

FIGURE 7
[Figure: nucleotide sequence (positions 1-1109) and deduced amino acid sequence of the synthetic HIV-2
TMP gene, annotated with the fragment boundaries (fragments A, B and C) and restriction sites (SalI,
HindIII, NcoI, EcoRV, SnaBI, BamHI, Asp718I, HindIII). The coding sequence is recited in Claim 9.]
PSD306.PEP

MSLKIYSSAH GRHTRGVFVL GFLGFLATAG SAMGAASLTV SAQSRTLLAG IVQQQQQLLD VVKRQQELLR   70
LTVWGTKNLQ ARVTAIEKYL QDQARLNSWG CAFRQVCHTT VPWVNDSLAP DWDNMTWQEW EKQVRYLEAN  140
ISKSLEQAQI QQEKNMYELQ KLNSWDIFGN WFDLTSWVKY IQYGVLIIVA VIALRIVIYV VQMLSRLRKG  210
YRPVFSSPPG YIQQIHIHKD RGQPANEETE EDGGSNGGDR YWPWPIAYIH FLIRQLIRLL TRLYSICRDL  280
LSRSFLTLQL IYQNLRDWLR LRTAFLQYGC EWIQEAFQAA ARATRETLAG ACRGLWRVLE RIGRGILAVP  350
RRIRQGAEIA LLVPSSWQFG PG.

PSD307.PEP

MTMITPSLAA GPDTGHSSQV SQNYPIVQNI QGQMVHQAIS PRTLNAWVKV VEEKAFSPEV IPMFSALSEG   70
ATPQDLNTML NTVGGHQAAM QMLKETINEE AAEWDRVHPV HAGPIAPGQM REPRGSDIAG TTSTLQEQIG  140
WMTNNPPIPV GEIYKRWIIL GLNKIVRMYS PTSILDIRQG PKEPFRDYVD RFYKTLRAEQ ASQEVKNWMT  210
ETLLVQNANP DCKTILKALG PAATLEEMMT ACQGVGGPGH KARVLAEAMS QVTNTATIMM QRGNFRNQRK  280
MVKCFNCGKE GHTARNCRAL DLQPSLKIYS SAHGRHTRGV FVLGFLGFLA TAGSAMGAAS LTVSAQSRTL  350
LAGIVQQQQQ LLDVVKRQQE LLRLTVWGTK NLQARVTAIE KYLQDQARLN SWGCAFRQVC HTTVPWVNDS  420
LAPDWDNMTW QEWEKQVRYL EANISKSLEQ AQIQQEKNMY ELQKLNSWDI FGNWFDLTSW VKYIQYGVLI  490
IVAVIALRIV IYVVQMLSRL RKGYRPVFSS PPGYIQQIHI HKDRGQPANE ETEEDGGSNG GDRYWPWPIA  560
YIHFLIRQLI RLLTRLYSIC RDLLSRSFLT LQLIYQNLRD WLRLRTAFLQ YGCEWIQEAF QAAARATRET  630
LAGACRGLWR VLERIGRGIL AVPRRIRQGA EIALLVRVIN.

Annotations in the original figure mark the linker sequences, the HIV-1 gag sequence and the HIV-2 TMP
sequence within each polypeptide.

FIGURE 14
Synthetic DNA derived recombinant HIV antigens.

The present invention provides a method of synthesizing genes encoding unique HIV-1 and HIV-2
envelope proteins and their fragments, thereby allowing overexpression of these proteins in E. coli.
The HIV envelope proteins and their fragments have been expressed at high levels as individual proteins
or in fusion with other proteins. The HIV envelope proteins thus expressed in E. coli can be effectively
used for the detection of exposure to HIV as well as the discrimination of HIV-1 and HIV-2.